Data Intake
| Index | stub | file | data_type | taxon_string | translation_table |
|---|---|---|---|---|---|
| 0 | KX808498-truncated | KX808498-truncated.gb | GenBank | Caulerpa_cliftonii_HV03798 | 11 |
| 1 | KY509313-truncated | KY509313-truncated.gb | GenBank | Avrainvillea_mazei_HV02664 | 11 |
| 2 | MH591083-truncated | MH591083-truncated.gb | GenBank | Flabellia_petiolata_HV01202 | 11 |
| 3 | MH591084-truncated | MH591084-truncated.gb | GenBank | Flabellia_petiolata_HV01202 | 11 |
| 4 | MH591085-truncated | MH591085-truncated.gb | GenBank | Flabellia_petiolata_HV01202 | 11 |
| 5 | NC_026795-truncated | NC_026795-truncated.txt | GenBank | Bryopsis_plumosa_WEST4718 | 11 |
| 6 | KY819064-truncated-cds | KY819064-truncated.cds.fasta | CDS | Chlorodesmis_fastigiata_HV03865 | 11 |
| 7 | KX808497-truncated | KX808497-truncated.fa | CDS | Derbesia_sp_WEST4838 | 11 |
OrthoFinder
| Name | Value |
|---|---|
| Number of species | 8 |
| Number of genes | 65 |
| Number of genes in orthogroups | 41 |
| Number of unassigned genes | 24 |
| Percentage of genes in orthogroups | 63.1 |
| Percentage of unassigned genes | 36.9 |
| Number of orthogroups | 7 |
| Number of species-specific orthogroups | 0 |
| Number of genes in species-specific orthogroups | 0 |
| Percentage of genes in species-specific orthogroups | 0.0 |
| Mean orthogroup size | 5.9 |
| Median orthogroup size | 6.0 |
| G50 (assigned genes) | 6 |
| G50 (all genes) | 6 |
| O50 (assigned genes) | 4 |
| O50 (all genes) | 6 |
| Number of orthogroups with all species present | 0 |
| Number of single-copy orthogroups | 0 |
| Date | 2024-05-20 |
| Orthogroups file | Orthogroups.tsv |
| Unassigned genes file | Orthogroups_UnassignedGenes.tsv |
| Per-species statistics | Statistics_PerSpecies.tsv |
| Overall statistics | Statistics_Overall.tsv |
| Orthogroups shared between species | Orthogroups_SpeciesOverlaps.tsv |
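The percentages and the mean orthogroup size in the summary follow directly from the three counts it reports. The sketch below recomputes them as a sanity check; the variable names are illustrative and are not OrthoFinder identifiers.

```python
# Counts copied from the OrthoFinder summary table in this report.
total_genes = 65
genes_in_orthogroups = 41
orthogroups = 7

# Derived quantities (names are illustrative, not OrthoFinder's own).
unassigned_genes = total_genes - genes_in_orthogroups
pct_assigned = 100 * genes_in_orthogroups / total_genes
pct_unassigned = 100 * unassigned_genes / total_genes
mean_orthogroup_size = genes_in_orthogroups / orthogroups

print(f"unassigned genes: {unassigned_genes}")
print(f"assigned: {pct_assigned:.1f}%  unassigned: {pct_unassigned:.1f}%")
print(f"mean orthogroup size: {mean_orthogroup_size:.1f}")
```

The recomputed values round to the figures in the table (24 unassigned genes, 63.1% and 36.9%, mean size 5.9).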
Average number of genes per-species in orthogroup
| Index | Average number of genes per-species in orthogroup | Number of orthogroups | Percentage of orthogroups | Number of genes | Percentage of genes |
|---|---|---|---|---|---|
| 0 | <1 | 7 | 100.0 | 41 | 100.0 |
| 1 | 1 | 0 | 0.0 | 0 | 0.0 |
| 2 | 2 | 0 | 0.0 | 0 | 0.0 |
| 3 | 3 | 0 | 0.0 | 0 | 0.0 |
| 4 | 4 | 0 | 0.0 | 0 | 0.0 |
| 5 | 5 | 0 | 0.0 | 0 | 0.0 |
| 6 | 6 | 0 | 0.0 | 0 | 0.0 |
| 7 | 7 | 0 | 0.0 | 0 | 0.0 |
| 8 | 8 | 0 | 0.0 | 0 | 0.0 |
| 9 | 9 | 0 | 0.0 | 0 | 0.0 |
| 10 | 10 | 0 | 0.0 | 0 | 0.0 |
| 11 | 11-15 | 0 | 0.0 | 0 | 0.0 |
| 12 | 16-20 | 0 | 0.0 | 0 | 0.0 |
| 13 | 21-50 | 0 | 0.0 | 0 | 0.0 |
| 14 | 51-100 | 0 | 0.0 | 0 | 0.0 |
| 15 | 101-150 | 0 | 0.0 | 0 | 0.0 |
| 16 | 151-200 | 0 | 0.0 | 0 | 0.0 |
| 17 | 201-500 | 0 | 0.0 | 0 | 0.0 |
| 18 | 501-1000 | 0 | 0.0 | 0 | 0.0 |
| 19 | 1001+ | 0 | 0.0 | 0 | 0.0 |
Number of Orthogroups vs. Number of OGs
| Species | Number of genes | Number of genes in orthogroups | Number of unassigned genes | Percentage of genes in orthogroups | Percentage of unassigned genes | Number of orthogroups containing species | Percentage of orthogroups containing species | Number of species-specific orthogroups | Number of genes in species-specific orthogroups | Percentage of genes in species-specific orthogroups |
|---|---|---|---|---|---|---|---|---|---|---|
| KX808497-truncated.translated | 7.0 | 7.0 | 0.0 | 100.0 | 0.0 | 7.0 | 100.0 | 0.0 | 0.0 | 0.0 |
| KX808498-truncated.translated | 31.0 | 8.0 | 23.0 | 25.8 | 74.2 | 7.0 | 100.0 | 0.0 | 0.0 | 0.0 |
| KY509313-truncated.translated | 7.0 | 7.0 | 0.0 | 100.0 | 0.0 | 7.0 | 100.0 | 0.0 | 0.0 | 0.0 |
| KY819064-truncated-cds.translated | 7.0 | 7.0 | 0.0 | 100.0 | 0.0 | 7.0 | 100.0 | 0.0 | 0.0 | 0.0 |
| MH591083-truncated.translated | 3.0 | 3.0 | 0.0 | 100.0 | 0.0 | 3.0 | 42.9 | 0.0 | 0.0 | 0.0 |
| MH591084-truncated.translated | 2.0 | 2.0 | 0.0 | 100.0 | 0.0 | 2.0 | 28.6 | 0.0 | 0.0 | 0.0 |
| MH591085-truncated.translated | 1.0 | 0.0 | 1.0 | 0.0 | 100.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| NC_026795-truncated.translated | 7.0 | 7.0 | 0.0 | 100.0 | 0.0 | 7.0 | 100.0 | 0.0 | 0.0 | 0.0 |
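The per-species rows can be cross-checked against the overall OrthoFinder summary. The sketch below sums the gene counts and recomputes the per-input percentage of assigned genes. The dictionary layout is an illustrative choice, not an OrthoFinder data structure.

```python
# (number of genes, genes in orthogroups) per input, copied from the
# per-species table in this report (".translated" suffix dropped).
per_species = {
    "KX808497-truncated": (7, 7),
    "KX808498-truncated": (31, 8),
    "KY509313-truncated": (7, 7),
    "KY819064-truncated-cds": (7, 7),
    "MH591083-truncated": (3, 3),
    "MH591084-truncated": (2, 2),
    "MH591085-truncated": (1, 0),
    "NC_026795-truncated": (7, 7),
}

total_genes = sum(genes for genes, _ in per_species.values())
total_assigned = sum(assigned for _, assigned in per_species.values())

# Percentage of each input's genes that were placed in an orthogroup.
pct_assigned = {
    name: round(100 * assigned / genes, 1)
    for name, (genes, assigned) in per_species.items()
}

print(total_genes, total_assigned)
for name, pct in pct_assigned.items():
    print(f"{name}: {pct}%")
```

The totals agree with the overall summary (65 genes, 41 of them in orthogroups), and 23 of the 24 unassigned genes come from the single input KX808498-truncated.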
results/orthofinder/output/Orthogroup_Sequences/OG0000000.fa
results/orthofinder/output/Orthogroup_Sequences/OG0000001.fa
results/orthofinder/output/Orthogroup_Sequences/OG0000002.fa
results/orthofinder/output/Orthogroup_Sequences/OG0000003.fa
results/orthofinder/output/Orthogroup_Sequences/OG0000004.fa
results/orthofinder/output/Orthogroup_Sequences/OG0000006.fa
results/orthofinder/orthosnap/OG0000005/OG0000005_orthosnap_0.fa
Alignment
results/alignment/trimmed_cds/OG0000000.trimmed.cds.alignment.fa
results/alignment/trimmed_cds/OG0000004.trimmed.cds.alignment.fa
Supermatrix
Newick Format
(Avrainvillea_mazei_HV02664:0.2083418034,(Bryopsis_plumosa_WEST4718:0.1431663088,Derbesia_sp_WEST4838:0.1843595331)100:0.0749149964,(Caulerpa_cliftonii_HV03798:0.1700810595,(Chlorodesmis_fastigiata_HV03865:0.0968430593,Flabellia_petiolata_HV01202:0.0777149909)98:0.0437234131)100:0.0864127447);
Newick Format
(Avrainvillea_mazei_HV02664:0.2083451972,(Bryopsis_plumosa_WEST4718:0.1431632303,Derbesia_sp_WEST4838:0.1843631128)100:0.0749217509,(Caulerpa_cliftonii_HV03798:0.1701007233,(Chlorodesmis_fastigiata_HV03865:0.0968536535,Flabellia_petiolata_HV01202:0.0777184225)98:0.0437063654)100:0.0864044195);
General Characteristics
=======================
6 Number of taxa
2388 Alignment length
496 Parsimony informative sites
496 Variable sites
1882 Constant sites
Character Frequencies
=====================
T 4524
G 2766
C 2251
A 4643
- 144
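The character counts in this summary can be tied back to the alignment dimensions and to the empirical base frequencies that IQ-TREE reports for the same supermatrix. The sketch below assumes the counts cover every cell of the 6 x 2388 alignment and that frequencies are taken over non-gap characters only. Both assumptions are consistent with the numbers shown in this report.

```python
# Character counts and dimensions copied from the alignment summary.
counts = {"T": 4524, "G": 2766, "C": 2251, "A": 4643, "-": 144}
n_taxa, alignment_length = 6, 2388

total_cells = sum(counts.values())

# Empirical base frequencies, ignoring gap characters (assumption).
n_bases = total_cells - counts["-"]
freqs = {base: counts[base] / n_bases for base in "ACGT"}

# Overall share of gap characters in the matrix, in percent.
gap_pct = 100 * counts["-"] / total_cells

print(total_cells, n_taxa * alignment_length)
print({base: round(f, 4) for base, f in freqs.items()})
print(f"gaps: {gap_pct:.2f}%")
```

Under those assumptions the frequencies round to the pi(A), pi(C), pi(G) and pi(T) values listed under SUBSTITUTION PROCESS, and the gap share matches the 1.01% total gap/ambiguity figure in the IQ-TREE log.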
IQ-TREE 2.2.0.3 COVID-edition built Aug 2 2022
Input file name: results/supermatrix/supermatrix.cds.fa
Type of analysis: ModelFinder + tree reconstruction + ultrafast bootstrap (1000 replicates)
Random seed number: 792646
REFERENCES
----------
To cite IQ-TREE please use:
Bui Quang Minh, Heiko A. Schmidt, Olga Chernomor, Dominik Schrempf,
Michael D. Woodhams, Arndt von Haeseler, and Robert Lanfear (2020)
IQ-TREE 2: New models and efficient methods for phylogenetic inference
in the genomic era. Mol. Biol. Evol., in press.
https://doi.org/10.1093/molbev/msaa015
To cite ModelFinder please use:
Subha Kalyaanamoorthy, Bui Quang Minh, Thomas KF Wong, Arndt von Haeseler,
and Lars S Jermiin (2017) ModelFinder: Fast model selection for
accurate phylogenetic estimates. Nature Methods, 14:587–589.
https://doi.org/10.1038/nmeth.4285
Since you used ultrafast bootstrap (UFBoot) please also cite:
Diep Thi Hoang, Olga Chernomor, Arndt von Haeseler, Bui Quang Minh,
and Le Sy Vinh (2018) UFBoot2: Improving the ultrafast bootstrap
approximation. Mol. Biol. Evol., 35:518–522.
https://doi.org/10.1093/molbev/msx281
SEQUENCE ALIGNMENT
------------------
Input data: 6 sequences with 2388 nucleotide sites
Number of constant sites: 1400 (= 58.6265% of all sites)
Number of invariant (constant or ambiguous constant) sites: 1400 (= 58.6265% of all sites)
Number of parsimony informative sites: 496
Number of distinct site patterns: 492
ModelFinder
-----------
Best-fit model according to BIC: GTR+F+G4
List of models sorted by BIC scores:
Model LogL AIC w-AIC AICc w-AICc BIC w-BIC
GTR+F+G4 -9950.622 19937.245 + 0.254 19937.534 + 0.257 20041.253 + 0.464
TIM2+F+G4 -9958.889 19949.778 - 0.000483 19950.007 - 0.000503 20042.229 + 0.285
TIM2+F+I -9960.089 19952.178 - 0.000145 19952.407 - 0.000152 20044.629 + 0.0857
GTR+F+I+G4 -9948.579 19935.159 + 0.721 19935.480 + 0.718 20044.945 + 0.0732
TIM2+F+I+G4 -9956.690 19947.381 - 0.0016 19947.639 - 0.00164 20045.610 + 0.0525
GTR+F+I -9953.065 19942.130 - 0.0221 19942.419 - 0.0224 20046.138 - 0.0403
TPM2u+F+I -9976.669 19983.337 - 2.49e-11 19983.540 - 2.63e-11 20070.011 - 2.64e-07
TVM+F+I -9969.542 19973.084 - 4.2e-09 19973.343 - 4.31e-09 20071.314 - 1.38e-07
TVM+F+G4 -9970.077 19974.155 - 2.46e-09 19974.413 - 2.52e-09 20072.384 - 8.05e-08
TPM2u+F+G4 -9978.794 19987.588 - 2.97e-12 19987.790 - 3.14e-12 20074.261 - 3.15e-08
TVM+F+I+G4 -9968.124 19972.249 - 6.37e-09 19972.537 - 6.45e-09 20076.256 - 1.16e-08
TIM+F+G4 -9976.342 19984.683 - 1.27e-11 19984.913 - 1.32e-11 20077.135 - 7.49e-09
TPM2u+F+I+G4 -9976.786 19985.571 - 8.15e-12 19985.801 - 8.49e-12 20078.023 - 4.81e-09
TIM+F+I -9978.268 19988.535 - 1.85e-12 19988.765 - 1.93e-12 20080.987 - 1.09e-09
TIM+F+I+G4 -9974.947 19983.894 - 1.89e-11 19984.152 - 1.94e-11 20082.123 - 6.18e-10
K3Pu+F+I -9994.574 20019.149 - 4.17e-19 20019.351 - 4.4e-19 20105.822 - 4.42e-15
K3Pu+F+G4 -9997.747 20025.495 - 1.75e-20 20025.697 - 1.84e-20 20112.168 - 1.85e-16
K3Pu+F+I+G4 -9996.293 20024.586 - 2.75e-20 20024.816 - 2.87e-20 20117.038 - 1.62e-17
TN+F+G4 -10003.782 20037.564 - 4.18e-23 20037.767 - 4.41e-23 20124.237 - 4.43e-19
TN+F+I -10005.761 20041.522 - 5.78e-24 20041.724 - 6.1e-24 20128.195 - 6.12e-20
TN+F+I+G4 -10002.313 20036.626 - 6.68e-23 20036.856 - 6.96e-23 20129.078 - 3.94e-20
TIM3+F+G4 -10003.134 20038.269 - 2.94e-23 20038.498 - 3.06e-23 20130.720 - 1.73e-20
TIM3+F+I -10004.794 20041.588 - 5.59e-24 20041.817 - 5.83e-24 20134.039 - 3.3e-21
TIM3+F+I+G4 -10001.831 20037.662 - 3.98e-23 20037.920 - 4.09e-23 20135.892 - 1.31e-21
HKY+F+I -10020.421 20068.842 - 6.75e-30 20069.019 - 7.22e-30 20149.737 - 1.29e-24
TPM3u+F+I -10019.385 20068.770 - 7e-30 20068.972 - 7.39e-30 20155.443 - 7.41e-26
HKY+F+G4 -10023.678 20075.356 - 2.6e-31 20075.533 - 2.78e-31 20156.251 - 4.95e-26
HKY+F+I+G4 -10022.196 20074.393 - 4.21e-31 20074.595 - 4.44e-31 20161.066 - 4.46e-27
TPM3u+F+G4 -10022.668 20075.336 - 2.63e-31 20075.538 - 2.77e-31 20162.009 - 2.78e-27
TPM3u+F+I+G4 -10021.313 20074.626 - 3.74e-31 20074.855 - 3.9e-31 20167.077 - 2.21e-28
F81+F+I -10041.407 20108.815 - 1.41e-38 20108.968 - 1.53e-38 20183.932 - 4.83e-32
F81+F+G4 -10046.374 20118.748 - 9.82e-41 20118.901 - 1.06e-40 20193.865 - 3.36e-34
F81+F+I+G4 -10044.860 20117.720 - 1.64e-40 20117.897 - 1.76e-40 20198.615 - 3.13e-35
SYM+I -10106.619 20243.238 - 9.11e-68 20243.440 - 9.62e-68 20329.911 - 9.66e-64
TVMe+I -10112.050 20252.100 - 1.08e-69 20252.277 - 1.16e-69 20332.995 - 2.07e-64
SYM+G4 -10109.264 20248.529 - 6.47e-69 20248.731 - 6.83e-69 20335.202 - 6.85e-65
SYM+I+G4 -10105.978 20243.957 - 6.36e-68 20244.186 - 6.63e-68 20336.408 - 3.75e-65
TVMe+G4 -10115.760 20259.521 - 2.65e-71 20259.698 - 2.84e-71 20340.416 - 5.06e-66
TVMe+I+G4 -10112.239 20254.478 - 3.3e-70 20254.680 - 3.49e-70 20341.151 - 3.5e-66
TIM2e+I -10159.736 20345.473 - 5.75e-90 20345.626 - 6.22e-90 20420.589 - 1.97e-83
TPM2+I -10164.413 20352.827 - 1.45e-91 20352.958 - 1.59e-91 20422.165 - 8.96e-84
TIM2e+I+G4 -10162.604 20353.209 - 1.2e-91 20353.386 - 1.29e-91 20434.104 - 2.29e-86
TIM2e+G4 -10166.496 20358.992 - 6.67e-93 20359.145 - 7.22e-93 20434.109 - 2.28e-86
TPM2+I+G4 -10167.174 20360.347 - 3.39e-93 20360.500 - 3.67e-93 20435.464 - 1.16e-86
TPM2+G4 -10171.100 20366.199 - 1.82e-94 20366.330 - 1.99e-94 20435.538 - 1.12e-86
TIMe+I -10187.704 20401.408 - 4.11e-102 20401.562 - 4.44e-102 20476.525 - 1.41e-95
K3P+I -10193.175 20410.349 - 4.7e-104 20410.480 - 5.14e-104 20479.688 - 2.89e-96
TIMe+G4 -10195.091 20416.182 - 2.54e-105 20416.335 - 2.75e-105 20491.298 - 8.71e-99
TIMe+I+G4 -10192.910 20413.821 - 8.28e-105 20413.998 - 8.86e-105 20494.716 - 1.58e-99
K3P+G4 -10203.234 20430.468 - 2.01e-108 20430.600 - 2.2e-108 20499.807 - 1.24e-100
K3P+I+G4 -10200.723 20427.446 - 9.11e-108 20427.600 - 9.86e-108 20502.563 - 3.12e-101
TIM3e+I -10233.779 20493.557 - 4.01e-122 20493.711 - 4.34e-122 20568.674 - 1.37e-115
TIM3e+G4 -10234.984 20495.968 - 1.2e-122 20496.121 - 1.3e-122 20571.085 - 4.12e-116
TPM3+I -10240.588 20505.176 - 1.2e-124 20505.307 - 1.32e-124 20574.515 - 7.41e-117
TIM3e+I+G4 -10233.164 20494.327 - 2.73e-122 20494.504 - 2.92e-122 20575.222 - 5.2e-117
TPM3+G4 -10246.096 20516.192 - 4.88e-127 20516.324 - 5.34e-127 20585.531 - 3e-119
TPM3+I+G4 -10243.636 20513.272 - 2.1e-126 20513.425 - 2.27e-126 20588.389 - 7.2e-120
GTR+F -10231.559 20497.118 - 6.77e-123 20497.376 - 6.95e-123 20595.348 - 2.22e-121
TNe+I -10266.205 20556.409 - 9.02e-136 20556.541 - 9.87e-136 20625.748 - 5.56e-128
JC+I -10275.298 20570.596 - 7.49e-139 20570.689 - 8.36e-139 20628.378 - 1.49e-128
K2P+I -10272.035 20566.071 - 7.2e-138 20566.182 - 7.96e-138 20629.631 - 7.97e-129
TNe+G4 -10273.280 20570.561 - 7.63e-139 20570.692 - 8.35e-139 20639.899 - 4.7e-131
TNe+I+G4 -10271.035 20568.070 - 2.65e-138 20568.223 - 2.87e-138 20643.186 - 9.08e-132
K2P+G4 -10281.744 20585.488 - 4.37e-142 20585.599 - 4.84e-142 20649.049 - 4.84e-133
JC+G4 -10285.852 20591.704 - 1.96e-143 20591.796 - 2.18e-143 20649.486 - 3.89e-133
K2P+I+G4 -10279.249 20582.498 - 1.95e-141 20582.630 - 2.14e-141 20651.837 - 1.2e-133
JC+I+G4 -10283.324 20588.649 - 9.01e-143 20588.760 - 9.96e-143 20652.209 - 9.97e-134
AIC, w-AIC : Akaike information criterion scores and weights.
AICc, w-AICc : Corrected AIC scores and weights.
BIC, w-BIC : Bayesian information criterion scores and weights.
Plus signs denote the 95% confidence sets.
Minus signs denote significant exclusion.
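The w-BIC column can be reproduced from the BIC scores with the usual information-criterion weights, w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2), where delta_i is the difference from the best (lowest) score. The sketch below uses only the seven best-scoring models from the table, on the assumption that the remaining models contribute a negligible share of the sum.

```python
import math

# BIC scores of the seven best models, copied from the ModelFinder table.
bic = {
    "GTR+F+G4": 20041.253,
    "TIM2+F+G4": 20042.229,
    "TIM2+F+I": 20044.629,
    "GTR+F+I+G4": 20044.945,
    "TIM2+F+I+G4": 20045.610,
    "GTR+F+I": 20046.138,
    "TPM2u+F+I": 20070.011,
}

best = min(bic.values())
# Relative likelihood of each model given its BIC difference from the best.
rel = {model: math.exp(-(score - best) / 2) for model, score in bic.items()}
norm = sum(rel.values())
weights = {model: value / norm for model, value in rel.items()}

for model, w in weights.items():
    print(f"{model:12s} {w:.3g}")
```

With these inputs the weight of GTR+F+G4 comes out close to 0.464 and that of TIM2+F+G4 close to 0.285, in line with the table. The same formula applied to the AIC or AICc columns gives w-AIC and w-AICc.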
SUBSTITUTION PROCESS
--------------------
Model of substitution: GTR+F+G4
Rate parameter R:
A-C: 2.6505
A-G: 2.9500
A-T: 3.9453
C-G: 1.9541
C-T: 6.1752
G-T: 1.0000
State frequencies: (empirical counts from alignment)
pi(A) = 0.3273
pi(C) = 0.1587
pi(G) = 0.195
pi(T) = 0.319
Rate matrix Q:
A -0.9608 0.1793 0.2452 0.5363
C 0.3698 -1.372 0.1624 0.8394
G 0.4116 0.1322 -0.6797 0.1359
T 0.5504 0.4177 0.08311 -1.051
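The rate matrix Q is consistent with the standard time-reversible construction: each off-diagonal entry is the exchangeability times the frequency of the target base, the whole matrix is scaled so that the expected rate is one substitution per site, and each diagonal entry makes its row sum to zero. The sketch below rebuilds Q from the rounded rate parameters and state frequencies printed above, so small differences in the last digit are expected.

```python
# Rate parameters R and state frequencies copied from this report (rounded).
rates = {
    ("A", "C"): 2.6505, ("A", "G"): 2.9500, ("A", "T"): 3.9453,
    ("C", "G"): 1.9541, ("C", "T"): 6.1752, ("G", "T"): 1.0000,
}
pi = {"A": 0.3273, "C": 0.1587, "G": 0.195, "T": 0.319}
bases = "ACGT"

def exchangeability(i, j):
    """Symmetric lookup of the rate parameter for the pair (i, j)."""
    return rates[(i, j)] if (i, j) in rates else rates[(j, i)]

# Unnormalised off-diagonal entries: q_ij = r_ij * pi_j.
Q = {i: {j: exchangeability(i, j) * pi[j] for j in bases if j != i} for i in bases}

# Scale so that the expected substitution rate at equilibrium equals 1.
mu = sum(pi[i] * sum(Q[i].values()) for i in bases)
for i in bases:
    for j in Q[i]:
        Q[i][j] /= mu
    # Diagonal entry: negative sum of the row's off-diagonal rates.
    Q[i][i] = -sum(Q[i].values())

for i in bases:
    print(i, " ".join(f"{Q[i][j]:8.4f}" for j in bases))
```

The rows print in the same A, C, G, T order as the matrix in the report and agree with it to about three decimal places.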
Model of rate heterogeneity: Gamma with 4 categories
Gamma shape alpha: 0.3949
Category Relative_rate Proportion
1 0.01598 0.25
2 0.1779 0.25
3 0.7257 0.25
4 3.08 0.25
Relative rates are computed as MEAN of the portion of the Gamma distribution falling in the category.
MAXIMUM LIKELIHOOD TREE
-----------------------
Log-likelihood of the tree: -9950.6201 (s.e. 142.9199)
Unconstrained log-likelihood (without tree): -9284.4483
Number of free parameters (#branches + #model parameters): 18
Akaike information criterion (AIC) score: 19937.2401
Corrected Akaike information criterion (AICc) score: 19937.5288
Bayesian information criterion (BIC) score: 20041.2479
Total tree length (sum of branch lengths): 1.0856
Sum of internal branch lengths: 0.2051 (18.8890% of tree length)
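The three information-criterion scores above follow from the log-likelihood, the number of free parameters k and the sample size n via the textbook definitions AIC = -2 lnL + 2k, AICc = AIC + 2k(k+1)/(n-k-1) and BIC = -2 lnL + k ln(n). The sketch below takes n to be the number of alignment sites (2388), which is the sample size stated by ModelFinder in the log.

```python
import math

# Values copied from the maximum-likelihood tree summary in this report.
log_likelihood = -9950.6201
k = 18     # free parameters: branches plus model parameters
n = 2388   # sample size, taken as the number of alignment sites

aic = -2 * log_likelihood + 2 * k
aicc = aic + 2 * k * (k + 1) / (n - k - 1)
bic = -2 * log_likelihood + k * math.log(n)

print(f"AIC  {aic:.4f}")
print(f"AICc {aicc:.4f}")
print(f"BIC  {bic:.4f}")
```

The results agree with the reported scores to within rounding of the log-likelihood. BIC penalises each parameter by ln(2388), about 7.78, instead of 2, which is why it prefers GTR+F+G4 over the one-parameter-richer GTR+F+I+G4 that AIC and AICc favour.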
NOTE: Tree is UNROOTED although outgroup taxon 'Avrainvillea_mazei_HV02664' is drawn at root
Numbers in parentheses are ultrafast bootstrap support (%)
+-----------------------------------------------Avrainvillea_mazei_HV02664
|
|                +--------------------------------Bryopsis_plumosa_WEST4718
+----------------| (100)
|                +-----------------------------------------Derbesia_sp_WEST4838
|
|                  +--------------------------------------Caulerpa_cliftonii_HV03798
+------------------| (100)
                   |         +---------------------Chlorodesmis_fastigiata_HV03865
                   +---------| (98)
                             +----------------Flabellia_petiolata_HV01202
Tree in newick format:
(Avrainvillea_mazei_HV02664:0.2083418034,(Bryopsis_plumosa_WEST4718:0.1431663088,Derbesia_sp_WEST4838:0.1843595331)100:0.0749149964,(Caulerpa_cliftonii_HV03798:0.1700810595,(Chlorodesmis_fastigiata_HV03865:0.0968430593,Flabellia_petiolata_HV01202:0.0777149909)98:0.0437234131)100:0.0864127447);
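The total tree length, the share contributed by internal branches and the bootstrap supports can all be read straight off the Newick string. The sketch below uses regular expressions rather than a tree library. It relies on this particular string having plain decimal branch lengths and integer support labels directly after each closing parenthesis, which is an assumption about the format and not a general Newick parser.

```python
import re

# Maximum-likelihood tree copied from this report.
newick = (
    "(Avrainvillea_mazei_HV02664:0.2083418034,"
    "(Bryopsis_plumosa_WEST4718:0.1431663088,Derbesia_sp_WEST4838:0.1843595331)100:0.0749149964,"
    "(Caulerpa_cliftonii_HV03798:0.1700810595,"
    "(Chlorodesmis_fastigiata_HV03865:0.0968430593,Flabellia_petiolata_HV01202:0.0777149909)98:0.0437234131)"
    "100:0.0864127447);"
)

# Every branch length follows a colon.
lengths = [float(x) for x in re.findall(r":([0-9.]+)", newick)]

# Internal branches are written as ")support:length".
internal = [(int(s), float(b)) for s, b in re.findall(r"\)(\d+):([0-9.]+)", newick)]

total_length = sum(lengths)
internal_length = sum(b for _, b in internal)
supports = [s for s, _ in internal]

print(f"branches: {len(lengths)}, total length: {total_length:.4f}")
print(f"internal: {internal_length:.4f} ({100 * internal_length / total_length:.4f}%)")
print(f"supports: {supports}")
```

The sums reproduce the total tree length and the sum of internal branch lengths stated in the report, and the nine branches account for 9 of the 18 free parameters.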
CONSENSUS TREE
--------------
Consensus tree is constructed from 1000 bootstrap trees
Log-likelihood of consensus tree: -9950.620052
Robinson-Foulds distance between ML tree and consensus tree: 0
Branches with support >0.000000% are kept (extended consensus)
Branch lengths are optimized by maximum likelihood on original alignment
Numbers in parentheses are bootstrap supports (%)
+-----------------------------------------------Avrainvillea_mazei_HV02664
|
|                +--------------------------------Bryopsis_plumosa_WEST4718
+----------------| (100)
|                +-----------------------------------------Derbesia_sp_WEST4838
|
|                  +--------------------------------------Caulerpa_cliftonii_HV03798
+------------------| (100)
                   |         +---------------------Chlorodesmis_fastigiata_HV03865
                   +---------| (98)
                             +----------------Flabellia_petiolata_HV01202
Consensus tree in newick format:
(Avrainvillea_mazei_HV02664:0.2083451972,(Bryopsis_plumosa_WEST4718:0.1431632303,Derbesia_sp_WEST4838:0.1843631128)100:0.0749217509,(Caulerpa_cliftonii_HV03798:0.1701007233,(Chlorodesmis_fastigiata_HV03865:0.0968536535,Flabellia_petiolata_HV01202:0.0777184225)98:0.0437063654)100:0.0864044195);
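The Robinson-Foulds distance of 0 between the ML tree and the consensus tree means that both trees contain the same set of non-trivial splits. The sketch below checks this with a small hand-written Newick reader. The helper names `clades` and `splits` are hypothetical, and the reader only handles simple strings like the two in this report (no quoted labels, no comments).

```python
import re

ml_tree = "(Avrainvillea_mazei_HV02664:0.2083418034,(Bryopsis_plumosa_WEST4718:0.1431663088,Derbesia_sp_WEST4838:0.1843595331)100:0.0749149964,(Caulerpa_cliftonii_HV03798:0.1700810595,(Chlorodesmis_fastigiata_HV03865:0.0968430593,Flabellia_petiolata_HV01202:0.0777149909)98:0.0437234131)100:0.0864127447);"
consensus_tree = "(Avrainvillea_mazei_HV02664:0.2083451972,(Bryopsis_plumosa_WEST4718:0.1431632303,Derbesia_sp_WEST4838:0.1843631128)100:0.0749217509,(Caulerpa_cliftonii_HV03798:0.1701007233,(Chlorodesmis_fastigiata_HV03865:0.0968536535,Flabellia_petiolata_HV01202:0.0777184225)98:0.0437063654)100:0.0864044195);"

def clades(newick):
    """Return (leaf names, leaf sets of every parenthesised group)."""
    stack, groups, leaves, prev = [], [], set(), None
    for tok in re.findall(r"[(),;]|[^(),;]+", newick):
        if tok == "(":
            stack.append(set())
        elif tok == ")":
            group = stack.pop()
            groups.append(frozenset(group))
            if stack:
                stack[-1] |= group
        elif tok not in (",", ";") and prev != ")":
            # A label that does not follow ")" is a leaf; drop its branch length.
            name = tok.split(":")[0]
            leaves.add(name)
            stack[-1].add(name)
        prev = tok
    return leaves, groups

def splits(newick):
    """Non-trivial bipartitions, each stored as the side without a reference taxon."""
    leaves, groups = clades(newick)
    reference = min(leaves)
    result = set()
    for group in groups:
        if 2 <= len(group) <= len(leaves) - 2:
            result.add(group if reference not in group else frozenset(leaves - group))
    return result

# Robinson-Foulds distance: number of splits found in exactly one of the trees.
rf_distance = len(splits(ml_tree) ^ splits(consensus_tree))
print(rf_distance, len(splits(ml_tree)))
```

Both trees yield the same three non-trivial splits, so the two Newick strings differ only in their branch lengths.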
TIME STAMP
----------
Date and time: Mon May 20 06:53:14 2024
Total CPU time used: 0.713534 seconds (0h:0m:0s)
Total wall-clock time used: 0.7210692549 seconds (0h:0m:0s)
IQ-TREE multicore version 2.2.0.3 COVID-edition for Linux 64-bit built Aug 2 2022
Developed by Bui Quang Minh, James Barbetti, Nguyen Lam Tung,
Olga Chernomor, Heiko Schmidt, Dominik Schrempf, Michael Woodhams, Ly Trong Nhan.
Host: mrc-hvlab (AVX2, FMA3, 251 GB RAM)
Command: iqtree2 -s results/supermatrix/supermatrix.cds.fa -bb 1000 -m TEST -nt 1 -redo -pre results/supermatrix/supermatrix.cds
Seed: 792646 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Mon May 20 06:53:13 2024
Kernel: AVX+FMA - 1 threads (64 CPU cores detected)
HINT: Use -nt option to specify number of threads because your CPU has 64 cores!
HINT: -nt AUTO will automatically determine the best number of threads to use.
Reading alignment file results/supermatrix/supermatrix.cds.fa ... Fasta format detected
Reading fasta file: done in 0.000392847 secs using 99.53% CPU
Alignment most likely contains DNA/RNA sequences
Constructing alignment: done in 0.00100483 secs using 88.97% CPU
Alignment has 6 sequences with 2388 columns, 492 distinct patterns
496 parsimony-informative, 492 singleton sites, 1399 constant sites
Gap/Ambiguity Composition p-value
Analyzing sequences: done in 9.87761e-06 secs
1 Avrainvillea_mazei_HV02664 1.13% passed 34.31%
2 Bryopsis_plumosa_WEST4718 0.38% passed 73.08%
3 Caulerpa_cliftonii_HV03798 0.00% passed 67.19%
4 Chlorodesmis_fastigiata_HV03865 1.51% passed 91.03%
5 Derbesia_sp_WEST4838 0.63% passed 90.09%
6 Flabellia_petiolata_HV01202 2.39% passed 90.54%
**** TOTAL 1.01% 0 sequences failed composition chi2 test (p-value<5%; df=3)
Checking for duplicate sequences: done in 2.26814e-05 secs
Create initial parsimony tree by phylogenetic likelihood library (PLL)... 0.000 seconds
Perform fast likelihood tree search using GTR+I+G model...
Estimate model parameters (epsilon = 5.000)
Perform nearest neighbor interchange...
Optimizing NNI: done in 0.00119415 secs using 99.82% CPU
Estimate model parameters (epsilon = 1.000)
1. Initial log-likelihood: -9948.592
Optimal log-likelihood: -9948.559
Rate parameters: A-C: 2.77621 A-G: 3.02264 A-T: 4.05715 C-G: 2.02910 C-T: 6.35200 G-T: 1.00000
Base frequencies: A: 0.327 C: 0.159 G: 0.195 T: 0.319
Proportion of invariable sites: 0.294
Gamma shape alpha: 0.898
Parameters optimization took 1 rounds (0.002 sec)
Time for fast ML tree search: 0.015 seconds
NOTE: ModelFinder requires 2 MB RAM!
ModelFinder will test up to 88 DNA models (sample size: 2388) ...
No. Model -LnL df AIC AICc BIC
1 GTR+F 10231.559 17 20497.118 20497.376 20595.348
2 GTR+F+I 9953.065 18 19942.130 19942.419 20046.138
3 GTR+F+G4 9950.622 18 19937.245 19937.534 20041.253
4 GTR+F+I+G4 9948.579 19 19935.159 19935.480 20044.945
6 SYM+I 10106.619 15 20243.238 20243.440 20329.911
7 SYM+G4 10109.264 15 20248.529 20248.731 20335.202
8 SYM+I+G4 10105.978 16 20243.957 20244.186 20336.408
10 TVM+F+I 9969.542 17 19973.084 19973.343 20071.314
11 TVM+F+G4 9970.077 17 19974.155 19974.413 20072.384
12 TVM+F+I+G4 9968.124 18 19972.249 19972.537 20076.256
14 TVMe+I 10112.050 14 20252.100 20252.277 20332.995
15 TVMe+G4 10115.760 14 20259.521 20259.698 20340.416
16 TVMe+I+G4 10112.239 15 20254.478 20254.680 20341.151
18 TIM3+F+I 10004.794 16 20041.588 20041.817 20134.039
19 TIM3+F+G4 10003.134 16 20038.269 20038.498 20130.720
20 TIM3+F+I+G4 10001.831 17 20037.662 20037.920 20135.892
22 TIM3e+I 10233.779 13 20493.557 20493.711 20568.674
23 TIM3e+G4 10234.984 13 20495.968 20496.121 20571.085
24 TIM3e+I+G4 10233.164 14 20494.327 20494.504 20575.222
26 TIM2+F+I 9960.089 16 19952.178 19952.407 20044.629
27 TIM2+F+G4 9958.889 16 19949.778 19950.007 20042.229
28 TIM2+F+I+G4 9956.690 17 19947.381 19947.639 20045.610
30 TIM2e+I 10159.736 13 20345.473 20345.626 20420.589
31 TIM2e+G4 10166.496 13 20358.992 20359.145 20434.109
32 TIM2e+I+G4 10162.604 14 20353.209 20353.386 20434.104
34 TIM+F+I 9978.268 16 19988.535 19988.765 20080.987
35 TIM+F+G4 9976.342 16 19984.683 19984.913 20077.135
36 TIM+F+I+G4 9974.947 17 19983.894 19984.152 20082.123
38 TIMe+I 10187.704 13 20401.408 20401.562 20476.525
39 TIMe+G4 10195.091 13 20416.182 20416.335 20491.298
40 TIMe+I+G4 10192.910 14 20413.821 20413.998 20494.716
42 TPM3u+F+I 10019.385 15 20068.770 20068.972 20155.443
43 TPM3u+F+G4 10022.668 15 20075.336 20075.538 20162.009
44 TPM3u+F+I+G4 10021.313 16 20074.626 20074.855 20167.077
46 TPM3+I 10240.588 12 20505.176 20505.307 20574.515
47 TPM3+G4 10246.096 12 20516.192 20516.324 20585.531
48 TPM3+I+G4 10243.636 13 20513.272 20513.425 20588.389
50 TPM2u+F+I 9976.669 15 19983.337 19983.540 20070.011
51 TPM2u+F+G4 9978.794 15 19987.588 19987.790 20074.261
52 TPM2u+F+I+G4 9976.786 16 19985.571 19985.801 20078.023
54 TPM2+I 10164.413 12 20352.827 20352.958 20422.165
55 TPM2+G4 10171.099 12 20366.199 20366.330 20435.538
56 TPM2+I+G4 10167.174 13 20360.347 20360.500 20435.464
58 K3Pu+F+I 9994.574 15 20019.149 20019.351 20105.822
59 K3Pu+F+G4 9997.747 15 20025.495 20025.697 20112.168
60 K3Pu+F+I+G4 9996.293 16 20024.586 20024.816 20117.038
62 K3P+I 10193.175 12 20410.349 20410.480 20479.688
63 K3P+G4 10203.234 12 20430.468 20430.600 20499.807
64 K3P+I+G4 10200.723 13 20427.446 20427.600 20502.563
66 TN+F+I 10005.761 15 20041.522 20041.724 20128.195
67 TN+F+G4 10003.782 15 20037.564 20037.767 20124.237
68 TN+F+I+G4 10002.313 16 20036.626 20036.856 20129.078
70 TNe+I 10266.205 12 20556.409 20556.541 20625.748
71 TNe+G4 10273.280 12 20570.561 20570.692 20639.899
72 TNe+I+G4 10271.035 13 20568.070 20568.223 20643.186
74 HKY+F+I 10020.421 14 20068.842 20069.019 20149.737
75 HKY+F+G4 10023.678 14 20075.356 20075.533 20156.251
76 HKY+F+I+G4 10022.196 15 20074.393 20074.595 20161.066
78 K2P+I 10272.035 11 20566.071 20566.182 20629.631
79 K2P+G4 10281.744 11 20585.488 20585.599 20649.049
80 K2P+I+G4 10279.249 12 20582.498 20582.630 20651.837
82 F81+F+I 10041.407 13 20108.815 20108.968 20183.932
83 F81+F+G4 10046.374 13 20118.748 20118.901 20193.865
84 F81+F+I+G4 10044.860 14 20117.720 20117.897 20198.615
86 JC+I 10275.298 10 20570.596 20570.689 20628.378
87 JC+G4 10285.852 10 20591.704 20591.796 20649.486
88 JC+I+G4 10283.324 11 20588.649 20588.760 20652.209
Akaike Information Criterion: GTR+F+I+G4
Corrected Akaike Information Criterion: GTR+F+I+G4
Bayesian Information Criterion: GTR+F+G4
Best-fit model: GTR+F+G4 chosen according to BIC
All model information printed to results/supermatrix/supermatrix.cds.model.gz
CPU time for ModelFinder: 0.457 seconds (0h:0m:0s)
Wall-clock time for ModelFinder: 0.463 seconds (0h:0m:0s)
Generating 1000 samples for ultrafast bootstrap (seed: 792646)...
NOTE: 2 MB RAM (0 GB) is required!
Estimate model parameters (epsilon = 0.100)
1. Initial log-likelihood: -9950.622
Optimal log-likelihood: -9950.621
Rate parameters: A-C: 2.64999 A-G: 2.93589 A-T: 3.95010 C-G: 1.95186 C-T: 6.16638 G-T: 1.00000
Base frequencies: A: 0.327 C: 0.159 G: 0.195 T: 0.319
Gamma shape alpha: 0.395
Parameters optimization took 1 rounds (0.002 sec)
Wrote distance file to...
Computing ML distances based on estimated model parameters...
Calculating distance matrix: done in 0.000193194 secs using 99.38% CPU
Computing ML distances took 0.000249 sec (of wall-clock time) 0.000251 sec (of CPU time)
Setting up auxiliary I and S matrices: done in 4.19477e-05 secs using 100.1% CPU
Constructing RapidNJ tree: done in 2.9075e-05 secs using 99.74% CPU
Computing RapidNJ tree took 0.000163 sec (of wall-clock time) 0.000162 sec (of CPU time)
Log-likelihood of RapidNJ tree: -9985.567
--------------------------------------------------------------------
| INITIALIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Generating 98 parsimony trees... 0.037 second
Computing log-likelihood of 5 initial trees ... 0.002 seconds
Current best score: -9950.621
Do NNI search on 7 best initial trees
Optimizing NNI: done in 0.00178874 secs using 99.85% CPU
Optimizing NNI: done in 0.0036082 secs using 99.97% CPU
Optimizing NNI: done in 0.00347201 secs using 99.97% CPU
Optimizing NNI: done in 0.00342017 secs using 99.97% CPU
Optimizing NNI: done in 0.00337928 secs using 99.96% CPU
Optimizing NNI: done in 0.00337344 secs using 99.96% CPU
Optimizing NNI: done in 0.00336761 secs using 99.95% CPU
Finish initializing candidate tree set (7)
Current best tree score: -9950.621 / CPU time: 0.065
Number of iterations: 7
--------------------------------------------------------------------
| OPTIMIZING CANDIDATE TREE SET |
--------------------------------------------------------------------
Optimizing NNI: done in 0.00300427 secs using 99.92% CPU
Optimizing NNI: done in 0.00302724 secs using 99.96% CPU
Optimizing NNI: done in 0.00473536 secs using 99.97% CPU
Iteration 10 / LogL: -9950.971 / Time: 0h:0m:0s
Optimizing NNI: done in 0.00467154 secs using 99.97% CPU
Optimizing NNI: done in 0.00455905 secs using 99.98% CPU
Optimizing NNI: done in 0.00475695 secs using 99.96% CPU
Optimizing NNI: done in 0.00289757 secs using 99.98% CPU
UPDATE BEST LOG-LIKELIHOOD: -9950.621
Optimizing NNI: done in 0.00466892 secs using 99.96% CPU
Optimizing NNI: done in 0.00128356 secs using 99.96% CPU
Optimizing NNI: done in 0.00455021 secs using 99.97% CPU
Optimizing NNI: done in 0.00300627 secs using 99.96% CPU
Optimizing NNI: done in 0.00126959 secs using 99.87% CPU
Optimizing NNI: done in 0.00289305 secs using 99.96% CPU
Iteration 20 / LogL: -9950.648 / Time: 0h:0m:0s
Optimizing NNI: done in 0.00277986 secs using 99.97% CPU
UPDATE BEST LOG-LIKELIHOOD: -9950.621
Optimizing NNI: done in 0.00134136 secs using 99.9% CPU
Optimizing NNI: done in 0.00443366 secs using 99.99% CPU
Optimizing NNI: done in 0.0013359 secs using 99.86% CPU
Optimizing NNI: done in 0.00467675 secs using 99.98% CPU
Optimizing NNI: done in 0.00472951 secs using 99.97% CPU
Optimizing NNI: done in 0.00293947 secs using 99.98% CPU
Optimizing NNI: done in 0.00132132 secs using 99.9% CPU
Optimizing NNI: done in 0.00290764 secs using 99.94% CPU
Optimizing NNI: done in 0.00133529 secs using 99.83% CPU
Iteration 30 / LogL: -9960.082 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.00128849 secs using 99.88% CPU
Optimizing NNI: done in 0.00449043 secs using 99.97% CPU
Optimizing NNI: done in 0.00297859 secs using 99.95% CPU
Optimizing NNI: done in 0.00290036 secs using 99.95% CPU
Optimizing NNI: done in 0.0028991 secs using 99.93% CPU
Optimizing NNI: done in 0.00293896 secs using 99.97% CPU
Optimizing NNI: done in 0.00459621 secs using 99.95% CPU
Optimizing NNI: done in 0.00296775 secs using 99.97% CPU
Optimizing NNI: done in 0.00297694 secs using 99.93% CPU
Optimizing NNI: done in 0.00129239 secs using 99.89% CPU
Iteration 40 / LogL: -9958.083 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.00454691 secs using 99.98% CPU
Optimizing NNI: done in 0.00298226 secs using 99.96% CPU
Optimizing NNI: done in 0.00286351 secs using 99.95% CPU
UPDATE BEST LOG-LIKELIHOOD: -9950.621
Optimizing NNI: done in 0.00132504 secs using 99.85% CPU
Optimizing NNI: done in 0.00289849 secs using 99.95% CPU
Optimizing NNI: done in 0.00295347 secs using 99.95% CPU
Optimizing NNI: done in 0.00461422 secs using 99.97% CPU
Optimizing NNI: done in 0.00290266 secs using 99.98% CPU
Optimizing NNI: done in 0.00292651 secs using 99.95% CPU
Optimizing NNI: done in 0.00296604 secs using 99.97% CPU
Iteration 50 / LogL: -9950.629 / Time: 0h:0m:0s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -9979.321
Optimizing NNI: done in 0.0013095 secs using 99.89% CPU
Optimizing NNI: done in 0.00296937 secs using 99.95% CPU
Optimizing NNI: done in 0.00273645 secs using 99.95% CPU
Optimizing NNI: done in 0.00466288 secs using 99.98% CPU
Optimizing NNI: done in 0.00127933 secs using 99.9% CPU
Optimizing NNI: done in 0.00453389 secs using 99.98% CPU
Optimizing NNI: done in 0.00299778 secs using 99.94% CPU
Optimizing NNI: done in 0.00442852 secs using 99.97% CPU
Optimizing NNI: done in 0.00452498 secs using 99.98% CPU
Optimizing NNI: done in 0.0029339 secs using 99.94% CPU
Iteration 60 / LogL: -9950.684 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.0044384 secs using 11.15% CPU
Optimizing NNI: done in 0.0029097 secs using 99.98% CPU
Optimizing NNI: done in 0.00305426 secs using 99.96% CPU
Optimizing NNI: done in 0.00294187 secs using 99.97% CPU
Optimizing NNI: done in 0.00474686 secs using 99.98% CPU
Optimizing NNI: done in 0.00294131 secs using 99.96% CPU
Optimizing NNI: done in 0.00459606 secs using 99.98% CPU
Optimizing NNI: done in 0.00286849 secs using 99.98% CPU
Optimizing NNI: done in 0.00290451 secs using 99.98% CPU
Optimizing NNI: done in 0.00293473 secs using 99.98% CPU
Iteration 70 / LogL: -9950.684 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.00272565 secs using 99.98% CPU
Optimizing NNI: done in 0.0044388 secs using 99.96% CPU
Optimizing NNI: done in 0.00293453 secs using 99.98% CPU
Optimizing NNI: done in 0.00295801 secs using 99.97% CPU
Optimizing NNI: done in 0.00122835 secs using 99.89% CPU
Optimizing NNI: done in 0.00453589 secs using 99.96% CPU
Optimizing NNI: done in 0.00448131 secs using 99.95% CPU
Optimizing NNI: done in 0.00129051 secs using 99.88% CPU
Optimizing NNI: done in 0.00288386 secs using 99.94% CPU
Optimizing NNI: done in 0.00289489 secs using 99.97% CPU
Iteration 80 / LogL: -9950.682 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.0046419 secs using 99.98% CPU
Optimizing NNI: done in 0.00287811 secs using 99.96% CPU
Optimizing NNI: done in 0.00471309 secs using 99.98% CPU
Optimizing NNI: done in 0.0028877 secs using 99.98% CPU
Optimizing NNI: done in 0.00290253 secs using 99.95% CPU
Optimizing NNI: done in 0.00271488 secs using 99.93% CPU
Optimizing NNI: done in 0.00288359 secs using 99.98% CPU
Optimizing NNI: done in 0.00452118 secs using 99.97% CPU
Optimizing NNI: done in 0.00459141 secs using 99.99% CPU
Optimizing NNI: done in 0.00465544 secs using 99.97% CPU
Iteration 90 / LogL: -9950.971 / Time: 0h:0m:0s (0h:0m:0s left)
Optimizing NNI: done in 0.00452944 secs using 99.99% CPU
Optimizing NNI: done in 0.0029089 secs using 99.97% CPU
Optimizing NNI: done in 0.00288244 secs using 99.95% CPU
Optimizing NNI: done in 0.00297433 secs using 99.96% CPU
Optimizing NNI: done in 0.00290399 secs using 99.97% CPU
Optimizing NNI: done in 0.00292248 secs using 99.95% CPU
Optimizing NNI: done in 0.00272215 secs using 99.96% CPU
Optimizing NNI: done in 0.00296445 secs using 99.95% CPU
Optimizing NNI: done in 0.00274692 secs using 99.97% CPU
Optimizing NNI: done in 0.00287882 secs using 99.97% CPU
Iteration 100 / LogL: -9950.648 / Time: 0h:0m:0s (0h:0m:0s left)
Log-likelihood cutoff on original alignment: -9979.321
NOTE: Bootstrap correlation coefficient of split occurrence frequencies: 1.000
Optimizing NNI: done in 0.00475389 secs using 99.98% CPU
TREE SEARCH COMPLETED AFTER 101 ITERATIONS / Time: 0h:0m:0s
--------------------------------------------------------------------
| FINALIZING TREE SEARCH |
--------------------------------------------------------------------
Performs final model parameters optimization
Estimate model parameters (epsilon = 0.010)
1. Initial log-likelihood: -9950.621
Optimal log-likelihood: -9950.620
Rate parameters: A-C: 2.65045 A-G: 2.95002 A-T: 3.94530 C-G: 1.95408 C-T: 6.17521 G-T: 1.00000
Base frequencies: A: 0.327 C: 0.159 G: 0.195 T: 0.319
Gamma shape alpha: 0.395
Parameters optimization took 1 rounds (0.002 sec)
BEST SCORE FOUND : -9950.620
Creating bootstrap support values...
Split supports printed to NEXUS file results/supermatrix/supermatrix.cds.splits.nex
Total tree length: 1.086
Total number of iterations: 101
CPU time used for tree search: 0.605 sec (0h:0m:0s)
Wall-clock time used for tree search: 0.606 sec (0h:0m:0s)
Total CPU time used: 0.714 sec (0h:0m:0s)
Total wall-clock time used: 0.715 sec (0h:0m:0s)
Computing bootstrap consensus tree...
Reading input file results/supermatrix/supermatrix.cds.splits.nex...
6 taxa and 12 splits.
Consensus tree written to results/supermatrix/supermatrix.cds.contree
Reading input trees file results/supermatrix/supermatrix.cds.contree
Log-likelihood of consensus tree: -9950.620
Analysis results written to:
IQ-TREE report: results/supermatrix/supermatrix.cds.iqtree
Maximum-likelihood tree: results/supermatrix/supermatrix.cds.treefile
Likelihood distances: results/supermatrix/supermatrix.cds.mldist
Ultrafast bootstrap approximation results written to:
Split support values: results/supermatrix/supermatrix.cds.splits.nex
Consensus tree: results/supermatrix/supermatrix.cds.contree
Screen log file: results/supermatrix/supermatrix.cds.log
ALISIM COMMAND
--------------
--alisim simulated_MSA -t results/supermatrix/supermatrix.cds.treefile -m "GTR{2.65045,2.95002,3.9453,1.95408,6.17521}+F{0.327341,0.1587,0.195008,0.318951}+G4{0.394866}" --length 2388
Date and Time: Mon May 20 06:53:14 2024
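The model string in the ALISIM command packs together the GTR+F+G4 parameters reported in the log above. As a rough illustration (not part of the pipeline), the sketch below parses that string and assembles a normalized GTR rate matrix using the usual convention Q_ij = r_ij * pi_j, with the G-T rate fixed at 1. The helper names `parse_braced` and `gtr_rate_matrix` are hypothetical, and the state order A, C, G, T is assumed from the "Base frequencies" line of the log.

```python
import re

# Model string copied from the ALISIM command in the log.
MODEL = ("GTR{2.65045,2.95002,3.9453,1.95408,6.17521}"
         "+F{0.327341,0.1587,0.195008,0.318951}+G4{0.394866}")


def parse_braced(model, prefix):
    """Return the comma-separated floats found inside PREFIX{...}."""
    match = re.search(re.escape(prefix) + r"\{([^}]*)\}", model)
    if match is None:
        raise ValueError(f"{prefix}{{...}} not found in model string")
    return [float(value) for value in match.group(1).split(",")]


def gtr_rate_matrix(rates, freqs):
    """Build a 4x4 GTR rate matrix (assumed state order A, C, G, T).

    rates: exchangeabilities A-C, A-G, A-T, C-G, C-T (G-T is fixed at 1).
    freqs: equilibrium base frequencies for A, C, G, T.
    The result is scaled to one expected substitution per unit time.
    """
    ac, ag, at, cg, ct = rates
    gt = 1.0
    exch = [
        [0.0, ac, ag, at],
        [ac, 0.0, cg, ct],
        [ag, cg, 0.0, gt],
        [at, ct, gt, 0.0],
    ]
    # Off-diagonal entries: exchangeability times target-state frequency.
    q = [[exch[i][j] * freqs[j] for j in range(4)] for i in range(4)]
    # Diagonal entries make every row sum to zero.
    for i in range(4):
        q[i][i] = -sum(q[i][j] for j in range(4) if j != i)
    # Normalize so that the mean substitution rate equals 1.
    scale = -sum(freqs[i] * q[i][i] for i in range(4))
    return [[value / scale for value in row] for row in q]


rates = parse_braced(MODEL, "GTR")
freqs = parse_braced(MODEL, "F")
alpha = parse_braced(MODEL, "G4")[0]  # Gamma shape parameter
q_matrix = gtr_rate_matrix(rates, freqs)

for base, row in zip("ACGT", q_matrix):
    print(base, " ".join(f"{value:8.4f}" for value in row))
```

The Gamma shape value is only extracted here. Turning it into the four discrete rate categories would need a Gamma quantile function, which is left out to keep the sketch dependency-free.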
Gene Tree
Supertree
Newick Format
(Derbesia_sp_WEST4838,(Bryopsis_plumosa_WEST4718,(Avrainvillea_mazei_HV02664,((Chlorodesmis_fastigiata_HV03865,Flabellia_petiolata_HV01202)0.87:0.6931471805599453,Caulerpa_cliftonii_HV03798)0.87:0.6931471805599453)0.87:0.6931471805599453):0.0);
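For readers who want to inspect the Newick string above without a tree library, the following is a minimal sketch using only the Python standard library. It assumes labels contain no quotes, whitespace, or bracketed comments, which holds for this string but not for Newick in general. The function names `leaf_names` and `internal_supports` are hypothetical.

```python
import re

# Supertree copied from the report.
NEWICK = (
    "(Derbesia_sp_WEST4838,(Bryopsis_plumosa_WEST4718,"
    "(Avrainvillea_mazei_HV02664,((Chlorodesmis_fastigiata_HV03865,"
    "Flabellia_petiolata_HV01202)0.87:0.6931471805599453,"
    "Caulerpa_cliftonii_HV03798)0.87:0.6931471805599453)"
    "0.87:0.6931471805599453):0.0);"
)


def leaf_names(newick):
    """Return leaf labels: tokens that directly follow '(' or ','."""
    return re.findall(r"(?<=[(,])[^(),:;]+", newick)


def internal_supports(newick):
    """Return (support, branch_length) pairs written as ')support:length'."""
    pairs = re.findall(r"\)([0-9.]+):([0-9.eE+-]+)", newick)
    return [(float(support), float(length)) for support, length in pairs]


leaves = leaf_names(NEWICK)
print(len(leaves), "taxa:")
for name in leaves:
    print(" ", name)

for support, length in internal_supports(NEWICK):
    print(f"support={support} branch_length={length}")
```

On this string the sketch finds six leaf labels and three labelled internal branches. For anything beyond quick inspection, a dedicated parser such as Toytree (reference 1 in the bibliography) is the safer choice.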
Workflow
Bibliography
1. Deren A. R. Eaton. Toytree: A minimalist tree visualization and manipulation library for Python. Methods in Ecology and Evolution, 11:187–191, 2020. doi:10.1111/2041-210X.13313.
2. David M. Emms and Steven Kelly. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biology, 20(1):238, 2019. doi:10.1186/s13059-019-1832-y.
3. Diep Thi Hoang, Olga Chernomor, Arndt von Haeseler, Bui Quang Minh, and Le Sy Vinh. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Molecular Biology and Evolution, 35(2):518–522, 2017. doi:10.1093/molbev/msx281.
4. Subha Kalyaanamoorthy, Bui Quang Minh, Thomas K F Wong, Arndt von Haeseler, and Lars S Jermiin. ModelFinder: fast model selection for accurate phylogenetic estimates. Nature Methods, 14(6):587–589, 2017. doi:10.1038/nmeth.4285.
5. Kazutaka Katoh and Daron M. Standley. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Molecular Biology and Evolution, 30(4):772–780, 2013. doi:10.1093/molbev/mst010.
6. Bui Quang Minh, Heiko A Schmidt, Olga Chernomor, Dominik Schrempf, Michael D Woodhams, Arndt von Haeseler, and Robert Lanfear. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Molecular Biology and Evolution, 37(5):1530–1534, 2020. doi:10.1093/molbev/msaa015.
7. Jacob L Steenwyk, Thomas J Buida III, Abigail L Labella, Yuanning Li, Xing-Xing Shen, and Antonis Rokas. PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data. Bioinformatics, 37(16):2325–2331, 2021. doi:10.1093/bioinformatics/btab096.
8. Jacob L. Steenwyk, Thomas J. Buida, Carla Gonçalves, Dayna C. Goltz, Grace Morales, Matthew E. Mead, Abigail L. LaBella, Christina M. Chavez, Jonathan E. Schmitz, Maria Hadjifrangiskou, Yuanning Li, and Antonis Rokas. BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data. bioRxiv, October 2021. doi:10.1101/2021.10.02.462868.
9. Jacob L. Steenwyk, Thomas J. Buida III, Yuanning Li, Xing-Xing Shen, and Antonis Rokas. ClipKIT: a multiple sequence alignment trimming software for accurate phylogenomic inference. PLOS Biology, 18(12):1–17, 2020. doi:10.1371/journal.pbio.3001007.
10. Chao Zhang, Maryam Rabiee, Erfan Sayyari, and Siavash Mirarab. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics, 19(6):153, 2018. doi:10.1186/s12859-018-2129-y.
@article{10.1093/molbev/mst010,
author = "Katoh, Kazutaka and Standley, Daron M.",
title = "{MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability}",
journal = "Molecular Biology and Evolution",
volume = "30",
number = "4",
pages = "772-780",
year = "2013",
month = "01",
abstract = "{We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.}",
issn = "0737-4038",
doi = "10.1093/molbev/mst010",
url = "https://doi.org/10.1093/molbev/mst010",
eprint = "https://academic.oup.com/mbe/article-pdf/30/4/772/6420419/mst010.pdf"
}
@article{Emms2019,
author = "Emms, David M. and Kelly, Steven",
type = "Journal Article",
title = "OrthoFinder: phylogenetic orthology inference for comparative genomics",
journal = "Genome Biology",
number = "1",
doi = "10.1186/s13059-019-1832-y",
volume = "20",
pages = "238",
url = "https://doi.org/10.1186/s13059-019-1832-y",
year = "2019",
abstract = "Here, we present a major advance of the OrthoFinder method. This extends OrthoFinder’s high accuracy orthogroup inference to provide phylogenetic inference of orthologs, rooted gene trees, gene duplication events, the rooted species tree, and comparative genomics statistics. Each output is benchmarked on appropriate real or simulated datasets, and where comparable methods exist, OrthoFinder is equivalent to or outperforms these methods. Furthermore, OrthoFinder is the most accurate ortholog inference method on the Quest for Orthologs benchmark test. Finally, OrthoFinder’s comprehensive phylogenetic analysis is achieved with equivalent speed and scalability to the fastest, score-based heuristic methods. OrthoFinder is available at https://github.com/davidemms/OrthoFinder.",
isbn = "1474-760X",
DA = "2019/11/14"
}
@article{10.1371/journal.pbio.3001007,
author = "Steenwyk, Jacob L. and Buida, III, Thomas J. and Li, Yuanning and Shen, Xing-Xing and Rokas, Antonis",
doi = "10.1371/journal.pbio.3001007",
journal = "PLOS Biology",
publisher = "Public Library of Science",
title = "ClipKIT: A multiple sequence alignment trimming software for accurate phylogenomic inference",
year = "2020",
month = "12",
volume = "18",
url = "https://doi.org/10.1371/journal.pbio.3001007",
pages = "1-17",
abstract = "Highly divergent sites in multiple sequence alignments (MSAs), which can stem from erroneous inference of homology and saturation of substitutions, are thought to negatively impact phylogenetic inference. Thus, several different trimming strategies have been developed for identifying and removing these sites prior to phylogenetic inference. However, a recent study reported that doing so can worsen inference, underscoring the need for alternative alignment trimming strategies. Here, we introduce ClipKIT, an alignment trimming software that, rather than identifying and removing putatively phylogenetically uninformative sites, instead aims to identify and retain parsimony-informative sites, which are known to be phylogenetically informative. To test the efficacy of ClipKIT, we examined the accuracy and support of phylogenies inferred from 14 different alignment trimming strategies, including those implemented in ClipKIT, across nearly 140,000 alignments from a broad sampling of evolutionary histories. Phylogenies inferred from ClipKIT-trimmed alignments are accurate, robust, and time saving. Furthermore, ClipKIT consistently outperformed other trimming methods across diverse datasets, suggesting that strategies based on identifying and retaining parsimony-informative sites provide a robust framework for alignment trimming.",
number = "12"
}
@article{Steenwyk_2021,
author = "Steenwyk, Jacob L. and Buida, Thomas J. and Gon{\c{c}}alves, Carla and Goltz, Dayna C. and Morales, Grace and Mead, Matthew E. and LaBella, Abigail L. and Chavez, Christina M. and Schmitz, Jonathan E. and Hadjifrangiskou, Maria and Li, Yuanning and Rokas, Antonis",
doi = "10.1101/2021.10.02.462868",
url = "https://doi.org/10.1101\%2F2021.10.02.462868",
year = "2021",
month = "oct",
journal = "biorxiv",
publisher = "Cold Spring Harbor Laboratory",
title = "{BioKIT}: a versatile toolkit for processing and analyzing diverse types of sequence data"
}
@article{10.1093/molbev/msx281,
author = "Hoang, Diep Thi and Chernomor, Olga and von Haeseler, Arndt and Minh, Bui Quang and Vinh, Le Sy",
title = "{UFBoot2: Improving the Ultrafast Bootstrap Approximation}",
journal = "Molecular Biology and Evolution",
volume = "35",
number = "2",
pages = "518-522",
year = "2017",
month = "10",
abstract = "{The standard bootstrap (SBS), despite being computationally intensive, is widely used in maximum likelihood phylogenetic analyses. We recently proposed the ultrafast bootstrap approximation (UFBoot) to reduce computing time while achieving more unbiased branch supports than SBS under mild model violations. UFBoot has been steadily adopted as an efficient alternative to SBS and other bootstrap approaches. Here, we present UFBoot2, which substantially accelerates UFBoot and reduces the risk of overestimating branch supports due to polytomies or severe model violations. Additionally, UFBoot2 provides suitable bootstrap resampling strategies for phylogenomic data. UFBoot2 is 778 times (median) faster than SBS and 8.4 times (median) faster than RAxML rapid bootstrap on tested data sets. UFBoot2 is implemented in the IQ-TREE software package version 1.6 and freely available at http://www.iqtree.org.}",
issn = "0737-4038",
doi = "10.1093/molbev/msx281",
url = "https://doi.org/10.1093/molbev/msx281",
eprint = "https://academic.oup.com/mbe/article-pdf/35/2/518/24367824/msx281.pdf"
}
@article{10.1093/bioinformatics/btab096,
author = "Steenwyk, Jacob L and Buida, Thomas J, III and Labella, Abigail L and Li, Yuanning and Shen, Xing-Xing and Rokas, Antonis",
title = "{PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data}",
journal = "Bioinformatics",
volume = "37",
number = "16",
pages = "2325-2331",
year = "2021",
month = "02",
abstract = "{Diverse disciplines in biology process and analyze multiple sequence alignments (MSAs) and phylogenetic trees to evaluate their information content, infer evolutionary events and processes and predict gene function. However, automated processing of MSAs and trees remains a challenge due to the lack of a unified toolkit. To fill this gap, we introduce PhyKIT, a toolkit for the UNIX shell environment with 30 functions that process MSAs and trees, including but not limited to estimation of mutation rate, evaluation of sequence composition biases, calculation of the degree of violation of a molecular clock and collapsing bipartitions (internal branches) with low support. To demonstrate the utility of PhyKIT, we detail three use cases: (1) summarizing information content in MSAs and phylogenetic trees for diagnosing potential biases in sequence or tree data; (2) evaluating gene–gene covariation of evolutionary rates to identify functional relationships, including novel ones, among genes and (3) identify lack of resolution events or polytomies in phylogenetic trees, which are suggestive of rapid radiation events or lack of data. We anticipate PhyKIT will be useful for processing, examining and deriving biological meaning from increasingly large phylogenomic datasets. PhyKIT is freely available on GitHub (https://github.com/JLSteenwyk/PhyKIT), PyPi (https://pypi.org/project/phykit/) and the Anaconda Cloud (https://anaconda.org/JLSteenwyk/phykit) under the MIT license with extensive documentation and user tutorials (https://jlsteenwyk.com/PhyKIT). Supplementary data are available at Bioinformatics online.}",
issn = "1367-4803",
doi = "10.1093/bioinformatics/btab096",
url = "https://doi.org/10.1093/bioinformatics/btab096",
eprint = "https://academic.oup.com/bioinformatics/article-pdf/37/16/2325/39948152/btab096.pdf"
}
@article{eaton_toytree_2020,
author = "Eaton, Deren A. R.",
title = "Toytree: {A} minimalist tree visualization and manipulation library for {Python}",
volume = "11",
doi = "10.1111/2041-210X.13313",
journal = "Methods in Ecology and Evolution",
year = "2020",
pages = "187--191"
}
@article{10.1093/molbev/msaa015,
author = "Minh, Bui Quang and Schmidt, Heiko A and Chernomor, Olga and Schrempf, Dominik and Woodhams, Michael D and von Haeseler, Arndt and Lanfear, Robert",
title = "{IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era}",
journal = "Molecular Biology and Evolution",
volume = "37",
number = "5",
pages = "1530-1534",
year = "2020",
month = "02",
abstract = "{IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.}",
issn = "0737-4038",
doi = "10.1093/molbev/msaa015",
url = "https://doi.org/10.1093/molbev/msaa015",
eprint = "https://academic.oup.com/mbe/article-pdf/37/5/1530/33386032/msaa015.pdf"
}
@article{Kalyaanamoorthy2017,
author = "Kalyaanamoorthy, Subha and Minh, Bui Quang and Wong, Thomas K F and von Haeseler, Arndt and Jermiin, Lars S",
type = "Journal Article",
title = "ModelFinder: fast model selection for accurate phylogenetic estimates",
journal = "Nature Methods",
number = "6",
doi = "10.1038/nmeth.4285",
volume = "14",
pages = "587--589",
url = "https://doi.org/10.1038/nmeth.4285",
year = "2017",
abstract = "ModelFinder is a fast model-selection method that greatly improves the accuracy of phylogenetic estimates.",
issn = "1548-7105",
DA = "2017/06/01"
}
@article{Zhang2018,
author = "Zhang, Chao and Rabiee, Maryam and Sayyari, Erfan and Mirarab, Siavash",
type = "Journal Article",
title = "ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees",
journal = "BMC Bioinformatics",
number = "6",
doi = "10.1186/s12859-018-2129-y",
volume = "19",
pages = "153",
url = "https://doi.org/10.1186/s12859-018-2129-y",
year = "2018",
abstract = "Evolutionary histories can be discordant across the genome, and such discordances need to be considered in reconstructing the species phylogeny. ASTRAL is one of the leading methods for inferring species trees from gene trees while accounting for gene tree discordance. ASTRAL uses dynamic programming to search for the tree that shares the maximum number of quartet topologies with input gene trees, restricting itself to a predefined set of bipartitions.",
issn = "1471-2105",
DA = "2018/05/08"
}
All taxa are found in orthogroups.
File(s) not Fasta or Genbank file. Suffix from file 'NC_026795-truncated.txt' is not Fasta or Genbank. File is assumed to be in Fasta format.
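The warning above shows the input format being inferred from the file suffix, with FASTA as the fallback, even though the Data Intake table lists `NC_026795-truncated.txt` as GenBank. One possible complement, sketched here as an assumption and not as the pipeline's actual logic, is to look at the first non-blank line of the file: GenBank flat files begin with a `LOCUS` line and FASTA records begin with `>`. The function name `sniff_format` and the sample strings are hypothetical.

```python
def sniff_format(text):
    """Guess 'genbank' or 'fasta' from the first non-blank line.

    Returns None when the first non-blank line matches neither format
    or when the text has no non-blank lines.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith("LOCUS"):
            return "genbank"
        if stripped.startswith(">"):
            return "fasta"
        return None  # first real line is neither format
    return None


# Made-up sample content, used only to exercise the function.
print(sniff_format("LOCUS       EXAMPLE  100 bp  DNA\nDEFINITION  example."))
print(sniff_format("\n>example_cds\nATGGCTTAA\n"))
print(sniff_format("not a sequence file"))
```

A check like this only reads the start of the file, so it could run before the suffix-based fallback without noticeable cost.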